GERALDA: A framework for integration of genomic information into the database of the Brazilian Health System

According to the IBGE, the majority of Brazilians are of mixed race. Racial and genetic admixture affects demographic information, and it is in the public interest that genetic information on demographic groups be used to improve the Unified Health System (SUS) of Brazil. Integrating genomic and ancestry information of Brazilians into public health data, such as the data provided by the Information Technology Department of SUS (DATASUS), can help estimate genetic risk of disease and inform policies to improve diagnosis, allocation of resources, services, and therapies for population groups at higher genetic risk. Here we introduce GERALDA, a computational framework in the R language that uses mitochondrial variants to estimate the genetic risk of neuroblastoma and neurodegenerative diseases in Brazilians. Using mitochondrial DNA variants, this work indicates an increased genetic risk of disease, with potential impact on cognition, morbidity, and mortality of Brazilians. This information can be used in organizing the public health system, contributing to a rational use of its resources.

Gepoliano Chaves https://www.britannica.com/animal/quokka (University of Chicago), Pablo Ivan Ramos https://www.britannica.com/animal/bilby (CIDACS, Instituto Gonçalo Moniz)
2024-10-04

1 Significance

Our group identified hypoxia as a triggering signal for the cellular transition from adrenergic (ADRN) to mesenchymal (MES) cells in neuroblastoma. We found that the transition is mediated by the deposition of the epigenetic marker 5-hydroxymethyl-cytosine (5-hmC) via the ten eleven translocation enzyme (TET1), a functional mechanism of hypomethylation. We are investigating hypomethylation patterns via 5-hmC in immortalized cells, tumor samples and cell-free DNA (cfDNA) isolated from peripheral blood liquid biopsies of neuroblastoma patients. Repetitive genomic elements, or repetitive regions of the genome, are part of the non-coding region and are important for the maintenance of the pluripotent cellular state of stem cells, which we call the de-differentiated state. Mesenchymal stem cells (MES) are dedifferentiated and their gene expression pattern shows a high correlation with the stem and pluripotent cell state.

We evaluated the deposition of the epigenetic marker 5-hmC in repetitive regions of the genomes of cells, tumors, and cfDNA of neuroblastoma patients for oncological management of patients. We aim to incorporate liquid biopsy with sequencing of 5-hmC in repetitive element markers into the routine of public precision medicine programs for neuroblastoma in Brazil. To this end, we propose the construction of a computational database that incorporates genomic and demographic data available in the health system, to allow better data analysis, machine learning, and artificial intelligence using omic marker data for the management of nervous system diseases in the public health system, together with the Center for Data Integration and Knowledge for Health (CIDACS) of the Gonçalo Moniz Institute at Fiocruz in Bahia.

2 Introduction

2.1 Cellular transition and reprogramming

Neuroblastoma is a pediatric cancer of the peripheral nervous system. Tumors are composed of two main cell lineages: adrenergic (ADRN) and mesenchymal (MES) cells. These cells can interconvert using enhancers and superenhancers (van Groningen et al. 2017; Boeva et al. 2017), epigenetic markers in noncoding genome regions identified using machine learning by hierarchical clustering (van Groningen et al. 2017) and principal component analysis (Boeva et al. 2017). Despite the importance of the mechanism of ADRN-to-MES cell interconversion, the cellular signals that trigger the transition are not understood. ADRN cells are neuron-like and sensitive to chemotherapy (Figure 1, left), while MES cells are similar to undifferentiated or dedifferentiated stem cells (Figure 1, right) and are responsible for resistance to chemotherapy and immunotherapy (van Groningen et al. 2017; Kendsersky et al. 2022; Mabe et al. 2022).

Hypoxia is the condition of limited oxygen supply for tumor growth. Among extracellular signals that activate 5-hmC deposition via TET1, our group at the University of Chicago identified hypoxia as a signal that activates gene expression by 5-hmC deposition and methylation removal (Mariani et al., 2014). Our studies and those of other groups have shown that hypoxia drives dedifferentiation in neuroblastoma cells (Jögi et al. 2002; Mariani et al. 2014; Hains et al. 2022). We identified a functional epigenetic demethylation mechanism mediated by Ten Eleven Translocation (TET) family enzymes, and genes activated by hypoxia in the transition from the ADRN to MES cellular state (Chaves et al. 2024 - in preparation), (Figure 5). Our group’s results propose a functional mechanism of hypoxia-driven methylation removal and gene expression activation (Mariani et al., 2014; Hains et al., 2022; Chaves et al. 2024 - in preparation) (Figures 2 and 5). Our work suggests hypoxia as a mechanism for maintaining cellular dedifferentiation, consistent with the hypomethylation of repetitive transposable elements (Figure 5).

Most CpG islands are located in repetitive intergenic regions of the genome (Lanciano and Cristofari 2020) and, when methylated, serve to maintain an inactive state of DNA transcription (Fetahu and Taschner-Mandl 2021). Repetitive transposable DNA elements (TEs) are mobile genetic elements that make up a large fraction of the genome, reaching 15% in C. elegans and 85% in maize (Lanciano and Cristofari, 2020). These percentages show the potential of these elements as biomarkers (Lanciano and Cristofari, 2020). TEs have been considered “junk DNA” (Burns 2017; Ma et al. 2022) and their value in epigenetic modulation in neuroblastoma and as biomarkers is undetermined.

TEs are active in human embryonic stem cells and, through functional hypomethylation, participate in the activation of gene expression in a mechanism compatible with TET1 activity and 5-hmC deposition (Ma et al., 2022). Retrotransposons such as HERV-H demarcate topologically associated domains (TADs) in chromatin, which maintain the cellular pluripotency state (Zhang et al., 2019). These observations agree with the hypothesis of induction of a stem cell (dedifferentiated) state in hypoxia-activated ADRN cells, involving repetitive or transposable elements, which lead to the expression of MES genes (Figure 5). Important biomarkers can be identified among TE elements, since they represent a high fraction of the genome composition. Thus, the detection of repetitive elements of the genome in cfDNA isolated by liquid biopsy can help in the management of cancer patients (Figure 2).

Studies from our group have identified 5-hmC deposition patterns (Figure 3A) in tumors (Applebaum et al., 2019) and liquid biopsy cfDNA (Applebaum et al., 2020) using the nano 5-hmC-seal method developed by Dr. Chuan He from the University of Chicago (Figure 3B). We propose that the TET1 enzyme, activated by hypoxia, causes oxidation of methylated DNA sequences and formation of 5-hydroxymethyl-cytosine (5-hmC) hypomethylation regions (Figure 3A). Thus, in the present project, sequencing the 5-hmC marker in Brazil will aid in patient management, as done in the USA (Chennakesavalu, Moore, Chaves, et al. 2024). We postulate that it will be possible to quantify ADRN and MES phenotypes in tumors in Brazilians, as we have recently done (Vayani, Chaves et al. 2023).

2.2 Mitochondria in cellular dedifferentiation or reprogramming in the nervous system

Mitochondria are important organelles in cellular metabolism and the citric acid cycle and play a role in pluripotency and differentiation at the embryonic stage (Carey et al. 2015; Hensley, Wasti, and DeBerardinis 2013; Qing et al. 2012; Yoo et al. 2020). Mitochondrial DNA is small and maternally inherited, with approximately 16,600 base pairs (16,569 bp in the human reference sequence). Mitochondria provide the cellular supply of ATP for the nervous system, playing an important role in diseases of this system, such as neuroblastoma and neurodegenerative diseases. In neuroblastoma, John Maris’ group conducted case-control studies in Caucasians from the USA and observed that mitochondrial haplogroups (genetic variants) are associated with a reduced risk of the disease (Chang et al. 2020). The same group observed that the mitochondrial single nucleotide polymorphism (SNP) rs2853493 is associated with the risk of neuroblastoma and affects the expression of the mitochondrial cytochrome B gene, MT-CYB (Chang et al. 2022).

In neurodegenerative diseases, Tranah et al. (2012) observed that in elderly Caucasians, different mitochondrial haplogroups present an increased risk of developing dementia and cognitive decline. The group reported that in African-Americans, haplogroup L1 represents a greater risk of developing dementia when compared to haplogroup L3, which is more common in this ethnic-racial group. Also for haplogroup L3, the variant p.V193I, a substitution in the ND2 gene, was associated with increased levels of amyloid plaque, a phenotype of Alzheimer’s disease (Tranah et al. 2014). Neuroblastoma arises during sympathetic neurogenesis. Neurogenesis and nervous system development pathways are suppressed in mitochondrial haplogroups in neuroblastoma, and this may underlie the reduced risk associated with the mitochondrial haplogroups investigated in Chang et al. (2020). Because mitochondrial sequences are important for both dementia and neuroblastoma, computational methods need to be developed to quantify the genetic risk associated with them, especially in populations other than US Whites, for which data are scarce. The choice of mitochondrial sequences also considers an existing cross-talk between mitochondrial metabolism and activation of repetitive genomic elements (Baeken, Moosmann, and Hajieva 2020; Larsen et al. 2017; Stoccoro and Coppedè 2021; Lopes 2020; Bravo et al. 2020), which has implications for the role of mitochondrial haplogroups in activation of the immune system, as discussed by Chang et al. (2020).

To allow demographic data on nervous system diseases to be contrasted in the context of the public health system of Brazil, we propose exploring mortality data from neuroblastoma and neurodegenerative diseases in the public database of the Informatics Department of the Ministry of Health of Brazil (DATASUS). This will allow the development of computational tools capable of estimating genetic risk for different Brazilian racial groups, a long-needed approach for the health system of Brazil, based on the identification of mitochondrial variants that confer a higher risk of neuroblastoma and neurodegenerative diseases, using data curated from the literature (Tranah et al. 2012; Tranah et al. 2014; Chang et al. 2020; Chang et al. 2022).

2.3 Management of public health data

We aim to improve management and decision-making processes in health systems by developing tools capable of processing and analyzing data generated by such systems, investigating the dynamics that affect mortality in various morbidities, such as those that affect the nervous system as is the case of neuroblastoma. In Brazil, the Federal Constitution of 1988 established the Unified Health System, and after that, the Department of Informatics of the SUS (DATASUS) was created to organize the data collected by the SUS (Saldanha, Rocha Bastos, and Barcellos, 2019). More recently, the Programa Genomas Brasil was implemented in the country, aiming to sequence 100,000 nationals to inform precision medicine policies in the public health system of SUS. Our group verified the persistence of racial inequality in the risk and survival of neuroblastoma (Chennakesavalu et al. 2023). Due to the genetic admixture present in Brazil since the beginning of colonization, a considerable part of the Brazilian population is of mixed race or black, which raises questions about the incidence, risk and treatment of neuroblastoma patients in Brazil considering their racial identification. Brazil is marked by profound socioeconomic inequalities related to the ethnic-racial origin of the population, which include but are not limited to digital literacy (Araújo da Silva and Behar 2019), use of programming languages (Sano et al. 2024; Vera-Choqqueccota et al. 2024) for genomic science, and racial inequality in health. The latter was recently identified by the Longitudinal Study of Adult Health (ELSA-Brazil) (ELSA Brazil 2023). The creation of databases linked to the SUS for genomic research must therefore take into account the impact of Brazilian genetic admixture on both access to digital literacy and the health of Brazilians themselves. 
Considering the public resources required for genomic studies and the need for genomic literacy among researchers and scientists in Brazil, we propose the Geralda framework as a concept to guide the integration of genomic information into the demographic database of SUS, as described in the Materials and Methods section.

3 Materials and Methods

3.1 Health System Mortality Data

The R package microdatasus was used to access mortality data on neuroblastoma and neurodegenerative diseases as described by Saldanha et al. (2019).
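As a minimal sketch of how mortality records could be filtered once they have been downloaded, the example below works on a small hand-made data frame, not on real DATASUS data. The column names CAUSABAS (underlying cause of death, ICD-10) and RACACOR (race/color) follow the layout of the Mortality Information System (SIM) and are assumed here to be already decoded to text labels. The rows, the helper name count_deaths_by_race, and the choice of the ICD-10 prefix G30 (Alzheimer's disease) are assumptions made only for illustration.

```r
# Toy stand-in for processed SIM records (NOT real data).
sim_toy <- data.frame(
  CAUSABAS = c("G309", "G301", "C749", "I219", "G300", "C749"),
  RACACOR  = c("Parda", "Branca", "Preta", "Parda", "Parda", "Branca"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: count deaths per race/color category whose underlying
# cause starts with any of the given ICD-10 prefixes.
count_deaths_by_race <- function(sim, icd_prefixes) {
  pattern <- paste0("^(", paste(icd_prefixes, collapse = "|"), ")")
  selected <- sim[grepl(pattern, sim$CAUSABAS), , drop = FALSE]
  as.data.frame(table(RACACOR = selected$RACACOR),
                responseName = "deaths", stringsAsFactors = FALSE)
}

# Deaths with an underlying cause in ICD-10 block G30, by race/color.
alzheimer_by_race <- count_deaths_by_race(sim_toy, "G30")
print(alzheimer_by_race)
```

The same helper accepts several prefixes at once, for example count_deaths_by_race(sim_toy, c("G30", "G31")), so the set of causes under study can be changed without touching the filtering logic.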

3.2 Classification of Mitochondrial Haplogroups

Mitochondrial haplogroups were classified using Haplogrep 3 as described in Schönherr et al. (2023). The classification produces a CSV file that can be used with other R packages to examine the genetic risk of neuroblastoma and neurodegenerative diseases associated with each haplogroup.

3.3 Estimation of Genetic Risk per Large Geographic Region

Risk was estimated and plotted using geobr as described by Pereira and Goncalves (2024). Geobr is a computational package to download official spatial data sets of Brazil. The package includes a wide range of geospatial data in GeoPackage format (similar to shapefiles), available at various geographic scales and for various years with harmonized attributes, projection and topology. This allows a spatial-geographic organization of the data provided by the DATASUS department of the Ministry of Health.
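A possible way to place per-region values on a map is sketched below. It assumes that geobr::read_region() returns an sf object with a numeric code_region column holding the IBGE region codes. The helper name plot_region_risk, the column risk_share, and the numbers in risk_df are hypothetical placeholders, not results of this work.

```r
# Placeholder values for illustration only (NOT results from this study).
# code_region uses the IBGE codes: 1 North, 2 Northeast, 3 Southeast,
# 4 South, 5 Center-West.
risk_df <- data.frame(
  code_region = c(2, 3, 4),
  region      = c("Northeast", "Southeast", "South"),
  risk_share  = c(0.30, 0.40, 0.20)
)

# Hypothetical helper: draw a choropleth of risk_share by large region.
# Needs the geobr, sf and ggplot2 packages and an internet connection,
# because geobr downloads the region geometries.
plot_region_risk <- function(risk_df) {
  regions_sf <- geobr::read_region(showProgress = FALSE)
  # Keep all regions; those without data get NA and are drawn in grey.
  map_df <- merge(regions_sf, risk_df, by = "code_region", all.x = TRUE)
  ggplot2::ggplot(map_df) +
    ggplot2::geom_sf(ggplot2::aes(fill = risk_share)) +
    ggplot2::labs(fill = "Share of flagged haplogroups")
}

# Usage (not run here because it downloads data):
# plot_region_risk(risk_df)
```

Joining on the numeric region code instead of the region name avoids depending on how the names are spelled in the downloaded data.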

3.4 General Genomic Algorithmic Data (Geralda) Management

The adoption of genomic information and epigenetic markers, such as the 5-hmC marker in neuroblastoma, into the SUS system involves challenges, which necessarily include activities to support digital literacy and the use of computer programming technologies in genomics throughout the country. Thus, the methodological part of this project includes collaborative genomic research between CIDACS in Brazil and my doctoral and postdoctoral institutions in the USA, to mediate the teaching of computer programming languages for genomic research and the construction of the database for the application of machine learning to demographic data from the SUS. CIDACS proposed the harmonization of databases of social and health indicators, creating the Cohort of 100 million Brazilians and making important contributions to national health and epidemiology (Barreto et al. 2021). Integrating genomic information databases into the pipelines that support investigations of social and demographic data in Brazil can enrich and improve the public policies of the national public health system. It also has the potential to generate protocols for the use of machine learning and artificial intelligence in disease classification in the health system.

Figure 1: Framework of the GERALDA pipeline. Starting with FASTA or FASTQ sequences, samples are aligned to the reference genome. Once a VCF file is produced for each sample, custom scripts extract the genotypes of interest, which are used as features for the machine learning algorithm to classify discrete categories. Each haplotype thus identified is then used to label a racial group in the microdatasus data frame. This information is then used to estimate the genetic risk of each racial group in the DATASUS data frame.
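The genotype-extraction step of the pipeline in Figure 1 can be sketched in base R as follows. This is only an illustration under stated assumptions: the VCF content is a two-sample toy example, the contig name chrM and the helper name extract_genotypes are hypothetical, and GT is taken to be the first FORMAT field, as the VCF specification requires when GT is present.

```r
# Two-sample toy VCF (NOT real data).
vcf_lines <- c(
  "##fileformat=VCFv4.2",
  "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2",
  "chrM\t16093\t.\tT\tC\t.\tPASS\t.\tGT\t1\t0",
  "chrM\t16223\t.\tC\tT\t.\tPASS\t.\tGT\t0\t1"
)

# Hypothetical helper: return the genotype of every sample at one position.
extract_genotypes <- function(lines, chrom, pos) {
  body <- lines[!grepl("^##", lines)]            # drop meta-information lines
  vcf <- read.table(text = body, sep = "\t", header = TRUE,
                    comment.char = "", check.names = FALSE,
                    colClasses = "character")
  names(vcf)[1] <- "CHROM"                       # header column is "#CHROM"
  row <- vcf[vcf$CHROM == chrom & vcf$POS == as.character(pos), , drop = FALSE]
  stopifnot(nrow(row) == 1)                      # exactly one record expected
  sample_cols <- names(vcf)[-(1:9)]              # columns 1-9 are fixed fields
  # Keep only the GT field (first colon-separated entry) of each sample.
  gt <- vapply(sample_cols,
               function(s) strsplit(row[[s]], ":")[[1]][1],
               character(1))
  data.frame(SampleID = sample_cols, GT = unname(gt), stringsAsFactors = FALSE)
}

features <- extract_genotypes(vcf_lines, chrom = "chrM", pos = 16093)
print(features)
```

The resulting data frame has one row per sample, so it can be joined to sample metadata by SampleID and used as a feature column.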

4 Results and Discussion

4.1 Classification of Brazilian Mitochondrial Haplogroups

Racial self-classification has social and political consequences in Brazil. Although it is acknowledged that the country has a historical genetic admixture involving Portuguese and other European, African, and Native American populations, Brazil struggles with racial inequality in health (Chor and Araujo Lima 2005). Racial classification has been difficult to achieve in Brazil because of this admixture. Ancestry, self-identification, and third-person identification have been proposed as parameters for racial classification, and the IBGE adopted self-declaration (Chor and Araujo Lima 2005). Another classification system is genetic or genomic ancestry. Genomic ancestry investigated using mitochondrial sequences can also inform the genetic risk for diseases of the nervous system such as neurodegenerative diseases and neuroblastoma. To investigate the ancestry of the self-declared white group in Brazil, we used Haplogrep 3 to classify the matrilineal lineage of self-declared white Brazilians, aiming to quantify genetic risk for variants known to affect the nervous system (Figure 2).

Figure 2: Mitochondrial haplogroups in sequences of self-declared white Brazilians, identified using the Haplogrep 3 tool (Schönherr, Weissensteiner, and Kronenberg 2023). The sample of mitochondrial sequences from Brazilians presents haplogroups J and K. These genotypes were related to the risk of dementia in individuals of European ancestry (blue) by Tranah et al. (2012). In individuals of African ancestry (purple), haplogroup L1 is identified, which presents an increased risk of developing dementia according to Tranah et al. (2014). Also according to Tranah et al. (2014), the most common haplogroup among people of African ancestry, haplogroup L3, which is also observed in this sample from Brazil with 4 individuals (4 counts), presents higher levels of amyloid plaque deposition. This suggests that these individuals represent a risk group for the development of dementia among Brazilians.

The genotypes in Figure 2 are the output of the Haplogrep 3 tool. The number of sequences can be cross-tabulated by continental origin of the mitochondrial lineage and by the Brazilian region of each sequence as follows.

library(dplyr)  # left_join(), select(), %>%
library(expss)  # cross_cases()

# Haplogrep 3 output: one row per sample with its assigned haplogroup
haplogroups.extended <- read.table("data/haplogroups/haplogroups.extended.csv", header = TRUE)

# Sample metadata: continental origin and Brazilian region of each sequence
regions <- read.table("../../ReComBio Scientific/geraldo/data/regions.txt", sep = "\t", header = TRUE)
haplogroups.regions <- dplyr::left_join(haplogroups.extended, regions, by = c("SampleID" = "ID"))
haplogroups.regions <- haplogroups.regions %>% dplyr::select(SampleID, Haplogroup, Origin, Region, Found_Polys)

# Cross-tabulate the number of sequences by origin (rows) and region (columns)
cross_cases(haplogroups.regions, Origin, Region)
                    Region
Origin              Northeast   South   Southeast
African                    18       0          30
Amerindian/Asian            7       0          15
European                   14      17           0
#Total cases               39      17          45

We can now take a look at the haplogroups.regions object that originated the table above. We want to use this object to investigate the risk of neuroblastoma and neurodegenerative diseases in each large region of Brazil, based on the genotypes of the mitochondrial DNA sequences sampled in that region.

library(dplyr)    # left_join(), select(), %>%
library(stringr)  # str_sub()
library(tibble)   # add_column()

# Per-sample mitochondrial genotype features, keyed by FASTA identifier
mt_features_df <- read.table("../../ReComBio Scientific/geraldo/data/mt_dcast_gt_lab.txt", header = TRUE)
haplogroups.features <- dplyr::left_join(haplogroups.regions, mt_features_df, by = c("SampleID" = "fastaID"))
haplogroups.features <- haplogroups.features %>% dplyr::select(SampleID, Haplogroup, Origin, Region, MT_16093_bp, Found_Polys)

# Keep only the first two characters of each haplogroup label (e.g. "L3e1" becomes "L3")
split_haplo_vector <- str_sub(haplogroups.features$Haplogroup, start = 1, end = 2)
# Insert the shortened label as column Genotype, right after Haplogroup, then drop Haplogroup
haplogroups.features <- add_column(haplogroups.features, Genotype = split_haplo_vector, .after = "Haplogroup")
haplogroups.features <- haplogroups.features %>% dplyr::select(-Haplogroup)

The following code first displays the polymorphisms found in each sample and then counts how many samples carry each haplogroup in the Brazilian sample as a whole:

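library(knitr)       # kable()
library(kableExtra)  # kable_styling(), scroll_box()

# Columns shown in the table: sample, two-character haplogroup, region, polymorphisms
df_selected <- haplogroups.features %>% dplyr::select(SampleID, Genotype, Region, Found_Polys)
kable(df_selected, caption="Found Polymorphisms") %>%
  kable_styling("striped", full_width = F, font_size = 12) %>%
  scroll_box(width = "100%", height = "600px")
Table 1: Found Polymorphisms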
SampleID Genotype Region Found_Polys
AF243627 A2 Northeast 152C! 16111T 16126C 16223T 16259T 16290T 16319A 16362C
AF243628 G1 Northeast 16223T 16325C 16362C
AF243629 B4 Northeast 16189C 16217C
AF243630 B2 Northeast 16189C 16217C 16249C 16312G 16344T
AF243631 A+ Northeast 16223T 16290T 16319A 16362C
AF243632 C1 Northeast 16223T 16298C 16325C 16327T 16362C
AF243633 M7 Northeast 16223T 16295T 16362C
AF243634 L1 Northeast 1438G! 15301A! 16126C 16129A! 16187T 16189C 16223T 16264T 16270T 16278T 16293G 16311C
AF243635 L3 Northeast 16176T 16223T 16327T
AF243636 M4 Northeast 6131G! 16223T 16294T 16294T
AF243637 L3 Northeast 16223T 16327T
AF243638 L0 Northeast 73G! 146C! 182T! 195C! 263G! 15301A! 16129A 16148T 16168T 16172C 16187T 16188G 16189C 16223T 16230G 16278T! 16311C 16320T
AF243639 L2 Northeast 150T! 182T! 16189C 16192T 16223T 16278T 16294T 16309G 16311C!
AF243640 L3 Northeast 16124C 16223T
AF243641 L3 Northeast 16185T 16223T 16327T
AF243642 L2 Northeast 16223T 16264T 16278T 16311C!
AF243643 L2 Northeast 150T! 182T! 16189C 16223T 16225T 16234T 16278T 16294T 16309G 16311C!
AF243644 L1 Northeast 15301A! 16129A 16187T 16189C 16214T 16223T 16265C 16278T 16291T 16294T 16311C 16360T
AF243645 L1 Northeast 15301A! 16129A 16187T 16189C 16223T 16265C 16278T 16286G 16294T 16311C 16360T
AF243646 L2 Northeast 150T! 182T! 16223T 16278T 16294T 16309G 16311C!
AF243647 L0 Northeast 73G! 146C! 182T! 195C! 263G! 15301A! 16129A 16148T 16168T 16172C 16187T 16188G 16189C 16223T 16230G 16278T! 16311C 16320T
AF243648 L3 Northeast 16124C 16223T
AF243649 L3 Northeast 16172C 16223T 16327T
AF243650 L2 Northeast 150T! 182T! 16223T 16278T 16294T 16309G 16311C!
AF243651 L3 Northeast 16129A 16209C 16223T 16292T 16295T 16311C
AF243652 H1 Northeast 16309G
AF243653 H1 Northeast 16362C
AF243654 J Northeast 16069T 16126C
AF243655 HV Northeast 16234T 16311C 16362C
AF243656 H1 Northeast 16075C 16189C 16356C
AF243657 K1 Northeast 16093C 16224C 16311C 16319A
AF243658 H7 Northeast 16221T
AF243659 K Northeast 16224C 16311C
AF243660 H2 Northeast
AF243661 H1 Northeast 16189C 16356C
AF243662 T2 Northeast 16126C 16294T 16296T 16304C
AF243663 H2 Northeast 16189C
AF243664 V7 Northeast 16153A 16298C
AF243665 H3 Northeast 16293G
AF243666 L3 Southeast 750G! 16223T 16265T
AF243667 L2 Southeast 16111A 16145A 16184T 16223T 16239T 16278T 16292T 16311C 16355T
AF243668 L3 Southeast 750G! 16223T 16265T
AF243669 L1 Southeast 15301A! 16086C 16129A 16187T 16189C 16223T 16241G 16274A 16278T 16291T 16293G 16294T 16311C 16360T
AF243670 L3 Southeast 16185T 16209C 16223T 16327T
AF243671 L2 Southeast 150T! 195C! 16223T 16224C 16278T 16311C!
AF243672 L2 Southeast 16223T 16264T 16278T 16311C 16311C
AF243673 L0 Southeast 73G! 146C! 182T! 185A! 195C! 263G! 15301A! 16129A! 16148T 16172C 16187T 16188G 16189C 16223T 16230G 16278T! 16311C 16320T
AF243674 M5 Southeast 16223T 16278T 16294T
AF243675 L1 Southeast 15301A! 16071T 16129A 16145A 16187T 16189C 16213A 16223T 16234T 16265C 16278T 16286G 16294T 16311C 16360T
AF243676 U6 Southeast 16172C 16189C 16219G 16278T
AF243677 U6 Southeast 16172C 16189C 16219G 16278T 16362C
AF243678 L4 Southeast 5460A! 16223T 16293T 16311C 16355T 16362C
AF243679 L2 Southeast 16114A 16129A 16213A 16223T 16278T 16311C!
AF243680 L1 Southeast 1438G! 15301A! 16126C 16129A! 16187T 16189C 16223T 16264T 16270T 16278T 16311C
AF243681 L4 Southeast 5460A! 16223T 16293T 16311C 16355T 16362C
AF243682 X1 Southeast 16104T 16189C 16223T 16278T!
AF243683 L1 Southeast 195C! 2283T! 7055G! 15301A! 16104T 16129A! 16163G 16187T 16189C 16223T 16278T 16293G 16294T 16311C 16360T
AF243684 L3 Southeast 10398G! 16185T 16223T 16327T!
AF243685 L0 Southeast 73G! 146C! 152C! 182T! 195C! 263G! 15301A! 16093C 16129A 16148T 16168T 16172C 16187T 16188G 16189C 16223T 16230G 16278T 16278T 16293G 16311C 16320T
AF243686 L3 Southeast 16185T 16223T 16327T
AF243687 L1 Southeast 15301A! 16129A 16187T 16189C 16223T 16278T 16293G 16294T 16311C 16360T
AF243688 L1 Southeast 195C! 7055G! 15301A! 16129A 16163G 16187T 16189C 16209C 16223T 16278T 16293G 16294T 16311C 16360T
AF243689 L1 Southeast 15301A! 16086C! 16129A! 16189C 16223T 16278T 16293G 16294T 16311C 16360T
AF243690 L3 Southeast 16223T 16320T
AF243691 L3 Southeast 16172C 16189C 16223T 16320T
AF243692 M5 Southeast 16223T 16278T 16294T
AF243693 L3 Southeast 16172C 16189C 16223T 16311C 16320T
AF243694 L1 Southeast 198T! 10398G! 15301A! 16129A 16187T 16189C 16223T! 16278T 16293G 16294T 16311C 16360T
AF243695 L3 Southeast 16172C 16189C 16223T 16320T
AF243696 A+ Southeast 16189C 16223T 16290T 16319A 16362C
AF243697 A2 Southeast 152C! 16097C 16098G 16111T 16223T 16290T 16319A 16362C
AF243698 C1 Southeast 16223T 16325C 16327T
AF243699 A2 Southeast 152C! 16111T! 16126C 16223T 16278T 16290T 16319A 16362C
AF243700 A2 Southeast 152C! 16111T 16192T 16223T 16290T 16319A 16362C
AF243701 A8 Southeast 16223T 16242T 16290T 16319A
AF243702 B4 Southeast 16189C 16217C
AF243703 B4 Southeast 16189C 16217C
AF243704 G1 Southeast 16223T 16325C 16362C
AF243705 C1 Southeast 16223T 16298C 16325C 16327T
AF243706 A2 Southeast 152C! 16111T 16223T 16290T 16319A 16362C
AF243707 B4 Southeast 16189C 16217C
AF243708 C1 Southeast 16223T 16298C 16325C 16327T
AF243709 B4 Southeast 16189C 16217C
AF243710 A2 Southeast 152C! 16111T 16189C 16223T 16290T 16319A 16362C
AF243780 H5 South 16304C
AF243781 H1 South 16162G
AF243782 U5 South 16144C 16189C 16192T! 16270T
AF243783 T2 South 16126C 16153A 16294T 16296T
AF243784 H2 South
AF243785 J1 South 16069T 16126C 16261T
AF243786 H2 South 16124C 16354T
AF243787 H7 South 16213A
AF243788 T2 South 16126C 16147T 16294T 16296T 16297C 16304C
AF243789 H3 South 16093C
AF243790 X2 South 16189C 16223T 16248T 16278T
AF243791 HV South 16298C
AF243792 U5 South 16189C 16192T! 16270T
AF243793 K1 South 16224C 16311C 16319A
AF243794 U7 South 16309G 16318T
AF243795 J1 South 2706G! 16069T 16126C 16222T
AF243796 R0 South 16126C 16362C
# Number of samples per two-character haplogroup, all regions pooled
haplogrupos_agregados <- df_selected %>% count(Genotype)
kable(haplogrupos_agregados, caption="Counts") %>%
  kable_styling("striped", full_width = F, font_size = 12) %>%
  scroll_box(width = "100%", height = "600px")
Table 2: Counts
Genotype n
A+ 2
A2 6
A8 1
B2 1
B4 5
C1 4
G1 2
H1 5
H2 4
H3 2
H5 1
H7 2
HV 2
J 1
J1 2
K 1
K1 2
L0 4
L1 11
L2 9
L3 16
L4 2
M4 1
M5 2
M7 1
R0 1
T2 3
U5 2
U6 2
U7 1
V7 1
X1 1
X2 1

The next step computes, for each large region of Brazil, the frequency of each haplogroup, that is, the number of samples carrying it divided by the number of samples from that region:

haplogrupos_agregados <- df_selected %>%
  group_by(Region) %>%
  count(Genotype) %>%
  # N is the number of samples in the region: the sum of the per-genotype counts.
  # (n() would count the distinct genotypes in the region instead.)
  mutate(N = sum(n)) %>%
  mutate(freq = n / N)

kable(haplogrupos_agregados, caption="Mitochondrial Haplogroups per Region") %>%
  kable_styling("striped", full_width = F, font_size = 12) %>%
  scroll_box(width = "100%", height = "600px")
Table 3: Mitochondrial Haplogroups per Region
Region Genotype n N freq
Northeast A+ 1 39 0.0256410
Northeast A2 1 39 0.0256410
Northeast B2 1 39 0.0256410
Northeast B4 1 39 0.0256410
Northeast C1 1 39 0.0256410
Northeast G1 1 39 0.0256410
Northeast H1 4 39 0.1025641
Northeast H2 2 39 0.0512821
Northeast H3 1 39 0.0256410
Northeast H7 1 39 0.0256410
Northeast HV 1 39 0.0256410
Northeast J 1 39 0.0256410
Northeast K 1 39 0.0256410
Northeast K1 1 39 0.0256410
Northeast L0 2 39 0.0512821
Northeast L1 3 39 0.0769231
Northeast L2 5 39 0.1282051
Northeast L3 7 39 0.1794872
Northeast M4 1 39 0.0256410
Northeast M7 1 39 0.0256410
Northeast T2 1 39 0.0256410
Northeast V7 1 39 0.0256410
South H1 1 17 0.0588235
South H2 2 17 0.1176471
South H3 1 17 0.0588235
South H5 1 17 0.0588235
South H7 1 17 0.0588235
South HV 1 17 0.0588235
South J1 2 17 0.1176471
South K1 1 17 0.0588235
South R0 1 17 0.0588235
South T2 2 17 0.1176471
South U5 2 17 0.1176471
South U7 1 17 0.0588235
South X2 1 17 0.0588235
Southeast A+ 1 45 0.0222222
Southeast A2 5 45 0.1111111
Southeast A8 1 45 0.0222222
Southeast B4 4 45 0.0888889
Southeast C1 3 45 0.0666667
Southeast G1 1 45 0.0222222
Southeast L0 2 45 0.0444444
Southeast L1 8 45 0.1777778
Southeast L2 4 45 0.0888889
Southeast L3 9 45 0.2000000
Southeast L4 2 45 0.0444444
Southeast M5 2 45 0.0444444
Southeast U6 2 45 0.0444444
Southeast X1 1 45 0.0222222
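As a small worked example of how the per-region counts could be summarized, the sketch below computes the share of samples in each region that carry one of the haplogroups named in the Figure 2 caption (J, K, L1, L3). The counts are transcribed from the per-region table above and the region totals from the "#Total cases" row of the Origin by Region cross-tabulation. Grouping J1 with J and K1 with K, and the object names flagged and flagged_share, are assumptions made for illustration; the resulting share is a descriptive proportion, not a risk estimate.

```r
# Per-region counts of the haplogroups named in the Figure 2 caption,
# transcribed from the per-region table (two-character genotype labels).
flagged <- data.frame(
  Region   = c("Northeast", "Northeast", "Northeast", "Northeast", "Northeast",
               "South", "South", "Southeast", "Southeast"),
  Genotype = c("J", "K", "K1", "L1", "L3", "J1", "K1", "L1", "L3"),
  n        = c(1, 1, 1, 3, 7, 2, 1, 8, 9),
  stringsAsFactors = FALSE
)

# Number of sequences per region ("#Total cases" row of the cross-tabulation).
region_totals <- c(Northeast = 39, South = 17, Southeast = 45)

# Sum the flagged counts within each region and divide by the region total.
flagged_n <- tapply(flagged$n, flagged$Region, sum)
flagged_share <- data.frame(
  Region  = names(flagged_n),
  flagged = as.vector(flagged_n),
  total   = as.vector(region_totals[names(flagged_n)]),
  stringsAsFactors = FALSE
)
flagged_share$share <- flagged_share$flagged / flagged_share$total
print(flagged_share)
```

Dividing by the number of sequences in the region, not by the number of distinct haplogroups, keeps the shares within a region between 0 and 1.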

The following table additionally shows, for each sample, the genotype at a single nucleotide polymorphism (SNP), reported in the MT_16093_bp column:

Table 4: Mitochondrial Haplogroups per Region
SampleID Genotype Origin Region MT_16093_bp Found_Polys
AF243627 A2 Amerindian/Asian Northeast NA 152C! 16111T 16126C 16223T 16259T 16290T 16319A 16362C
AF243628 G1 Amerindian/Asian Northeast NA 16223T 16325C 16362C
AF243629 B4 Amerindian/Asian Northeast NA 16189C 16217C
AF243630 B2 Amerindian/Asian Northeast NA 16189C 16217C 16249C 16312G 16344T
AF243631 A+ Amerindian/Asian Northeast NA 16223T 16290T 16319A 16362C
AF243632 C1 Amerindian/Asian Northeast NA 16223T 16298C 16325C 16327T 16362C
AF243633 M7 Amerindian/Asian Northeast NA 16223T 16295T 16362C
AF243634 L1 African Northeast 0 1438G! 15301A! 16126C 16129A! 16187T 16189C 16223T 16264T 16270T 16278T 16293G 16311C
AF243635 L3 African Northeast 0 16176T 16223T 16327T
AF243636 M4 African Northeast 0 6131G! 16223T 16294T 16294T
AF243637 L3 African Northeast 0 16223T 16327T
AF243638 L0 African Northeast 0 73G! 146C! 182T! 195C! 263G! 15301A! 16129A 16148T 16168T 16172C 16187T 16188G 16189C 16223T 16230G 16278T! 16311C 16320T
AF243639 L2 African Northeast 0 150T! 182T! 16189C 16192T 16223T 16278T 16294T 16309G 16311C!
AF243640 L3 African Northeast 1 16124C 16223T
AF243641 L3 African Northeast 0 16185T 16223T 16327T
AF243642 L2 African Northeast 0 16223T 16264T 16278T 16311C!
AF243643 L2 African Northeast 0 150T! 182T! 16189C 16223T 16225T 16234T 16278T 16294T 16309G 16311C!
AF243644 L1 African Northeast 0 15301A! 16129A 16187T 16189C 16214T 16223T 16265C 16278T 16291T 16294T 16311C 16360T
AF243645 L1 African Northeast 1 15301A! 16129A 16187T 16189C 16223T 16265C 16278T 16286G 16294T 16311C 16360T
AF243646 L2 African Northeast 0 150T! 182T! 16223T 16278T 16294T 16309G 16311C!
AF243647 L0 African Northeast 0 73G! 146C! 182T! 195C! 263G! 15301A! 16129A 16148T 16168T 16172C 16187T 16188G 16189C 16223T 16230G 16278T! 16311C 16320T
AF243648 L3 African Northeast 0 16124C 16223T
AF243649 L3 African Northeast 0 16172C 16223T 16327T
AF243650 L2 African Northeast 0 150T! 182T! 16223T 16278T 16294T 16309G 16311C!
AF243651 L3 African Northeast NA 16129A 16209C 16223T 16292T 16295T 16311C
AF243652 H1 European Northeast NA 16309G
AF243653 H1 European Northeast NA 16362C
AF243654 J European Northeast NA 16069T 16126C
AF243655 HV European Northeast NA 16234T 16311C 16362C
AF243656 H1 European Northeast NA 16075C 16189C 16356C
AF243657 K1 European Northeast NA 16093C 16224C 16311C 16319A
AF243658 H7 European Northeast NA 16221T
AF243659 K European Northeast NA 16224C 16311C
AF243660 H2 European Northeast NA
AF243661 H1 European Northeast NA 16189C 16356C
AF243662 T2 European Northeast NA 16126C 16294T 16296T 16304C
AF243663 H2 European Northeast NA 16189C
AF243664 V7 European Northeast NA 16153A 16298C
AF243665 H3 European Northeast NA 16293G
AF243666 L3 African Southeast NA 750G! 16223T 16265T
AF243667 L2 African Southeast NA 16111A 16145A 16184T 16223T 16239T 16278T 16292T 16311C 16355T
AF243668 L3 African Southeast NA 750G! 16223T 16265T
AF243669 L1 African Southeast NA 15301A! 16086C 16129A 16187T 16189C 16223T 16241G 16274A 16278T 16291T 16293G 16294T 16311C 16360T
AF243670 L3 African Southeast NA 16185T 16209C 16223T 16327T
AF243671 L2 African Southeast NA 150T! 195C! 16223T 16224C 16278T 16311C!
AF243672 L2 African Southeast NA 16223T 16264T 16278T 16311C 16311C
AF243673 L0 African Southeast NA 73G! 146C! 182T! 185A! 195C! 263G! 15301A! 16129A! 16148T 16172C 16187T 16188G 16189C 16223T 16230G 16278T! 16311C 16320T
AF243674 M5 African Southeast NA 16223T 16278T 16294T
AF243675 L1 African Southeast NA 15301A! 16071T 16129A 16145A 16187T 16189C 16213A 16223T 16234T 16265C 16278T 16286G 16294T 16311C 16360T
AF243676 U6 African Southeast NA 16172C 16189C 16219G 16278T
AF243677 U6 African Southeast NA 16172C 16189C 16219G 16278T 16362C
AF243678 L4 African Southeast NA 5460A! 16223T 16293T 16311C 16355T 16362C
AF243679 L2 African Southeast NA 16114A 16129A 16213A 16223T 16278T 16311C!
AF243680 L1 African Southeast NA 1438G! 15301A! 16126C 16129A! 16187T 16189C 16223T 16264T 16270T 16278T 16311C
AF243681 L4 African Southeast NA 5460A! 16223T 16293T 16311C 16355T 16362C
AF243682 X1 African Southeast NA 16104T 16189C 16223T 16278T!
AF243683 L1 African Southeast NA 195C! 2283T! 7055G! 15301A! 16104T 16129A! 16163G 16187T 16189C 16223T 16278T 16293G 16294T 16311C 16360T
AF243684 L3 African Southeast NA 10398G! 16185T 16223T 16327T!
AF243685 L0 African Southeast NA 73G! 146C! 152C! 182T! 195C! 263G! 15301A! 16093C 16129A 16148T 16168T 16172C 16187T 16188G 16189C 16223T 16230G 16278T 16278T 16293G 16311C 16320T
AF243686 L3 African Southeast NA 16185T 16223T 16327T
AF243687 L1 African Southeast NA 15301A! 16129A 16187T 16189C 16223T 16278T 16293G 16294T 16311C 16360T
AF243688 L1 African Southeast NA 195C! 7055G! 15301A! 16129A 16163G 16187T 16189C 16209C 16223T 16278T 16293G 16294T 16311C 16360T
AF243689 L1 African Southeast NA 15301A! 16086C! 16129A! 16189C 16223T 16278T 16293G 16294T 16311C 16360T
AF243690 L3 African Southeast NA 16223T 16320T
AF243691 L3 African Southeast NA 16172C 16189C 16223T 16320T
AF243692 M5 African Southeast NA 16223T 16278T 16294T
AF243693 L3 African Southeast NA 16172C 16189C 16223T 16311C 16320T
AF243694 L1 African Southeast NA 198T! 10398G! 15301A! 16129A 16187T 16189C 16223T! 16278T 16293G 16294T 16311C 16360T
AF243695 L3 African Southeast NA 16172C 16189C 16223T 16320T
AF243696 A+ Amerindian/Asian Southeast NA 16189C 16223T 16290T 16319A 16362C
AF243697 A2 Amerindian/Asian Southeast NA 152C! 16097C 16098G 16111T 16223T 16290T 16319A 16362C
AF243698 C1 Amerindian/Asian Southeast NA 16223T 16325C 16327T
AF243699 A2 Amerindian/Asian Southeast NA 152C! 16111T! 16126C 16223T 16278T 16290T 16319A 16362C
AF243700 A2 Amerindian/Asian Southeast NA 152C! 16111T 16192T 16223T 16290T 16319A 16362C
AF243701 A8 Amerindian/Asian Southeast NA 16223T 16242T 16290T 16319A
AF243702 B4 Amerindian/Asian Southeast NA 16189C 16217C
AF243703 B4 Amerindian/Asian Southeast NA 16189C 16217C
AF243704 G1 Amerindian/Asian Southeast NA 16223T 16325C 16362C
AF243705 C1 Amerindian/Asian Southeast NA 16223T 16298C 16325C 16327T
AF243706 A2 Amerindian/Asian Southeast NA 152C! 16111T 16223T 16290T 16319A 16362C
AF243707 B4 Amerindian/Asian Southeast NA 16189C 16217C
AF243708 C1 Amerindian/Asian Southeast NA 16223T 16298C 16325C 16327T
AF243709 B4 Amerindian/Asian Southeast NA 16189C 16217C
AF243710 A2 Amerindian/Asian Southeast NA 152C! 16111T 16189C 16223T 16290T 16319A 16362C
AF243780 H5 European South 0 16304C
AF243781 H1 European South 0 16162G
AF243782 U5 European South 0 16144C 16189C 16192T! 16270T
AF243783 T2 European South 0 16126C 16153A 16294T 16296T
AF243784 H2 European South 0
AF243785 J1 European South 0 16069T 16126C 16261T
AF243786 H2 European South 0 16124C 16354T
AF243787 H7 European South 0 16213A
AF243788 T2 European South 0 16126C 16147T 16294T 16296T 16297C 16304C
AF243789 H3 European South 1 16093C
AF243790 X2 European South 0 16189C 16223T 16248T 16278T
AF243791 HV European South 0 16298C
AF243792 U5 European South 0 16189C 16192T! 16270T
AF243793 K1 European South 0 16224C 16311C 16319A
AF243794 U7 European South 0 16309G 16318T
AF243795 J1 European South 0 2706G! 16069T 16126C 16222T
AF243796 R0 European South 0 16126C 16362C
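The Found_Polys column packs several variants into one space-separated string. As a minimal sketch of how such a string could be split into one row per variant, the helper below (parse_polys, a hypothetical name that is not part of GERALDA) separates the position from the allele and keeps the trailing "!" as a logical flag without interpreting its meaning. The two input strings are copied from the table above.

```r
# Hypothetical helper: split a Found_Polys string into one row per variant.
parse_polys <- function(sample_id, found_polys) {
  tokens <- strsplit(trimws(found_polys), "\\s+")[[1]]
  tokens <- tokens[nzchar(tokens)]        # an empty string yields zero variants
  flagged <- grepl("!$", tokens)          # trailing "!" is kept as a flag only
  core <- sub("!$", "", tokens)
  data.frame(
    SampleID = rep(sample_id, length(core)),
    position = as.integer(sub("[A-Za-z]+$", "", core)),  # numeric part
    allele   = sub("^[0-9]+", "", core),                 # letter part
    flagged  = flagged,
    stringsAsFactors = FALSE
  )
}

# Two rows copied from the table above
polys <- rbind(
  parse_polys("AF243627", "152C! 16111T 16126C 16223T 16259T 16290T 16319A 16362C"),
  parse_polys("AF243629", "16189C 16217C")
)
print(polys)
```

A long table of this kind can then be filtered by position, for example to list every sample carrying a variant at a given site.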


These results show that self-declared white identity in Brazil is not limited to individuals of European matrilineal ancestry, in possible agreement with the ideology of racial democracy proposed by Gilberto Freyre. They also suggest that both individuals who self-identify as white and public policy makers should consider the genetic risk associated with African variants in this group. At this point it is not possible to establish cause and effect, but neuroblastoma mortality is predominant among self-declared white individuals, either because self-declared white individuals most often use the public health system (evidence of structural racism) or because the genetic risk variants are also present in self-declared white individuals. We calculate an incidence of 30% for haplogroup L3 in the Northeast region (total of 17 samples analyzed from Brazil) (table above). For the 17 samples, the L3 haplogroup is not present among self-declared white individuals in the Southeast and South of Brazil. Mitochondrial haplogroups of African ancestry are observed in self-declared white Brazilians (Figure 7), consistent with Fridman’s (2014) observation of 35% African matrilineal lineage in self-declared white individuals in Brazil.
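A per-region incidence of this kind can be computed as the share of samples in each region that carry a given haplogroup. The sketch below uses base R on eight rows copied from the table above. It is an illustrative subset only, so its output does not reproduce the percentages reported in the text, and the object names (haplo, l3_share) are hypothetical.

```r
# Illustrative subset of the haplogroup table (SampleID, Genotype, Region)
haplo <- data.frame(
  SampleID = c("AF243634", "AF243635", "AF243652", "AF243654",
               "AF243666", "AF243667", "AF243780", "AF243785"),
  Genotype = c("L1", "L3", "H1", "J", "L3", "L2", "H5", "J1"),
  Region   = c("Northeast", "Northeast", "Northeast", "Northeast",
               "Southeast", "Southeast", "South", "South"),
  stringsAsFactors = FALSE
)

# Share of samples assigned to haplogroup L3 within each region
l3_share <- sapply(split(haplo$Genotype == "L3", haplo$Region), mean)
print(round(l3_share, 2))
```

Running the same two lines on the full table, after loading it into a data frame with these column names, would give the incidence per large region.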

On the other hand, while haplogroups L1 and L3 may confer a greater risk of diseases of the nervous system to individuals identified as white, haplogroup K, of European origin, is associated with a reduced risk of neuroblastoma according to Chang et al. (2020). This suggests that Brazilians who self-declare as white and carry the European mitochondrial genotype are less susceptible to neuroblastoma than individuals of African matrilineal lineage (Figure 7).

4.2 Applying genomic model to public health system

4.2.1. Neuroblastoma mortality in Brazil

To apply models such as the one described in Chaves et al., 2024 (in preparation) to genomic research on diseases of the nervous system in Brazilians, we propose a collaborative genomic research effort between UCSC and UChicago in the USA and CIDACS at Instituto Gonçalo Moniz in Brazil. To enable the collaboration, we accessed neuroblastoma mortality data for Brazilians using the Microdatasus R package. We found a predominance of deaths of self-declared white individuals in the first decade of the 2000s (Figure 3).

The prevalence of self-declared white individuals in neuroblastoma mortality data in Brazil can be attributed to the ideology of “whitening” (Pena et al. 2011) in the Brazilian racial identity (Mitchell 2022) and to gender asymmetry in interracial relationships in Brazil (Pena 2007). After 2010, a decrease in the proportion of mortality of self-declared white individuals was observed (Figure 3).

In the code below we plot neuroblastoma mortality in Brazil by self-declared race/color, using the dados_nb_appended_melted object, which was saved to an .rds file in a previous step.

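library(ggplot2)
library(dplyr)   # provides the %>% pipe
library(plotly)  # provides ggplotly()
# Load the melted neuroblastoma mortality table saved in a previous step
nb_mortality_df <- readRDS("../../ReComBio Scientific/geraldo/data/dados_nb_appended_melted.rds")
# Bars scaled to 1: share of deaths per race/color category in each year
p_nb <- nb_mortality_df %>% 
  ggplot(aes(x = year, y = value, 
             fill = raca_cor_factor)) + 
  geom_bar(position = "fill", stat = "identity")
ggplotly(p_nb)  # interactive version of the plot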

Figure 3: Annual mortality due to neuroblastoma in a sample of the Brazilian population between 2000 and 2015, made available by the SUS Information Technology Department, DATASUS. Mortality is calculated in R using the Microdatasus package.
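The bars in Figure 3 are drawn with position = "fill", which rescales the counts of each year so that they sum to 1. The same shares can be computed explicitly, as in the sketch below. The column names follow the plotting code (year, raca_cor_factor, value), while the counts and category labels are made up for illustration and are not DATASUS data.

```r
# Toy counts: placeholder values and labels, for illustration only
toy <- data.frame(
  year = rep(c(2000, 2001), each = 3),
  raca_cor_factor = rep(c("Branca", "Parda", "Preta"), times = 2),
  value = c(6, 3, 1, 5, 4, 1)
)

# Divide each count by the total of its year, as position = "fill" does
toy$share <- toy$value / ave(toy$value, toy$year, FUN = sum)
print(toy)
```

Keeping the shares in a column makes it possible to report the yearly proportion of deaths by category in a table as well as in the plot.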

Since the TET1/5-hmC hypomethylation genomic model can only be applied to Brazilians after sequencing 5-hmC from cfDNA of Brazilian patients, the current MVP is a proof of concept and analyzes mitochondrial DNA of Brazilians published by Alves-Silva et al., 2000.

The prevalence of self-declared white individuals in the neuroblastoma mortality data (Figure 3) is in line with the African ancestry (indicated in purple) of haplogroups L1 and L3 in self-declared white individuals in Brazil and with the higher risk of nervous system diseases associated with these haplogroups according to Tranah et al. (2012; 2014) (Figure 2). Because we identified genetic variants associated with a higher risk of diseases of the nervous system in the Brazilian mitochondrial sequences, we decided to examine the incidence of these mitochondrial haplogroups per large geographic region of Brazil, using the geographic information present in the Alves-Silva et al. (2000) study. To do that, we accessed geographic information about the large regions and states of the Brazilian territory using the R package geobr and its read_state function, as follows:

library(geobr)  # provides read_state()

data_estados_brancos_perct <- readRDS("../../ReComBio Scientific/geraldo/data/data_estados_brancos_perct.rds")

# Read the boundaries of all Brazilian states
states <- read_state(
  year = 2019, 
  showProgress = FALSE
  )

# Join the state geometries with the per-state data, matching abbrev_state to UF
states_perct_brancos <- dplyr::left_join(states, data_estados_brancos_perct, by = c("abbrev_state" = "UF"))

To begin exploring the spatial and geographic information in the demographic data stored by DATASUS, we visualized the mortality rate of self-declared white individuals in the states of Brazil. We obtained the number of self-declared white individuals with estimated dementia risk in each state. With this number, we can estimate the number of self-declared white individuals reported as deceased by the Unified Health System (SUS) in 2014. Proportionally, more self-reported white individuals died in Southern Brazil than in any other large region:

# states_perct_brancos <- readRDS("./data/states_perct_brancos.rds")
ggplot() +
  geom_sf(data = states_perct_brancos, aes(fill = brancos), 
          color = "black", size = .15) +  # color and size style the state borders
  labs(subtitle = "", size = 8) +
  scale_fill_distiller(palette = "Blues", name = "Mortality\nRate", direction = 1, 
                       limits = c(0, 1)) +
  theme_minimal()

Figure 4: Frequency of mortality of self-declared white individuals in Brazil

Having established where mortality of self-reported white individuals is highest, we begin assessing the incidence of the mitochondrial haplogroups that confer protection against, or risk of, diseases of the nervous system. Table 3 shows the incidence of each haplogroup identified in the mitochondrial sequences isolated from self-declared white Brazilians. The Region column in that table can be used to merge the genotype information with the demographic information contained in the following table, extracted using the Microdatasus R package.

dados_estados_nb_2013 <- readRDS("../../R/R journal/data/dados_estados_nb_2013.rds")
dim(dados_estados_nb_2013)
[1] 302   8
# kable(dados_estados_nb_2013, caption="dados_estados_nb_2013") %>%
#   kable_styling("striped", full_width = F, font_size = 12) %>%
#   scroll_box(width = "100%", height = "600px")

head(dados_estados_nb_2013)
      CONTADOR RACACOR  DTOBITO CAUSABAS   DTNASC IDADE UF    Region
6222      6222       1 23022013     C749 30122003   409 TO     North
31026    24155       1 16022013     C749 09022012   401 SP Southeast
44075    37204    <NA> 09042013     C749 25102009   403 SP Southeast
44097    37226       2 08082013     C749 03042012   401 SP Southeast
46745    39874       1 10012013     C749 22031953   459 SP Southeast
48051    41180       1 19012013     C749 02032011   401 SP Southeast

Note that both Table 3 and dados_estados_nb_2013 have a column named Region, which holds the large regions of Brazil. Similar columns can be used to store information about the state, municipality, city, and health unit serving the patient in the public health system.
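Because both tables share the Region column, they can be combined with a standard join. The sketch below uses base R merge() on two small data frames. The deaths records reuse a few CONTADOR, UF and Region values from the output above, whereas haplo_by_region and its counts are hypothetical placeholders standing in for a per-region haplogroup summary.

```r
# Hypothetical per-region haplogroup summary (placeholder counts)
haplo_by_region <- data.frame(
  Region    = c("Northeast", "Southeast", "South"),
  n_samples = c(4, 2, 2),
  n_L3      = c(1, 1, 0)
)

# A few records shaped like dados_estados_nb_2013 (columns reduced for brevity)
deaths <- data.frame(
  CONTADOR = c(6222, 24155, 37204),
  UF       = c("TO", "SP", "SP"),
  Region   = c("North", "Southeast", "Southeast")
)

# Left join: keep every death record, add haplogroup counts where available
merged <- merge(deaths, haplo_by_region, by = "Region", all.x = TRUE)
print(merged)
```

With all.x = TRUE, records from regions without genotype data (here, the North) are kept and receive NA in the haplogroup columns, which makes gaps in genomic coverage visible.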

4.2.2. Genetic risk estimation

After estimating mortality by race and region in Brazil (Figure 4), we begin estimating the incidence of the mitochondrial haplogroups by geographic region. These haplogroups have been reported to be associated with genetic risk of diseases of the nervous system. We can now estimate which mitochondrial lineages are exposed to the highest risks across the territory of Brazil.

Haplogroup J

This haplogroup is predominant in Southern Brazil, with a significant presence in the Northeast. Haplogroup J was reported by Tranah et al. (2012) to be associated with cognitive impairment. Feder et al. (2008) found that this haplogroup is also associated with type 2 diabetes in Ashkenazi Jews, a disease known to be more frequent in people who have neurodegenerative diseases (Chaves et al., 2019).

Figure 5: Estimation of the incidence of mitochondrial haplogroup J in populations of the large regions of Brazil.

Haplogroup K

This haplogroup was found to be predominant in Southern and Northeastern Brazil in this study. Chang et al. (2020) reported that haplogroup K is protective against neuroblastoma, including high-risk disease. It is possible that this haplogroup is associated with an increased inflammatory response and T-cell infiltration in hot neuroblastoma tumors via mitochondrial reprogramming of metabolism in cancer and immune cells.

Figure 6: Estimation of the incidence of mitochondrial haplogroup K in populations of the large regions of Brazil.

Haplogroup L3

This haplogroup is predominant in Southeastern and Northeastern Brazil. Haplogroup L3 was reported by Tranah et al. (2014) to be associated with cognitive impairment in African Americans. It is possible that this haplogroup is associated with increased mortality by neuroblastoma in self-declared white Brazilians of mitochondrial African ancestry.

Figure 7: Estimation of the incidence of mitochondrial haplogroup L3 in populations of the large regions of Brazil.

Haplogroup T

Haplogroup T was detected in samples from Northeastern and Southern Brazil. According to a study by Kofler et al. (2009), mitochondrial DNA haplogroup T is associated with coronary artery disease and diabetic retinopathy. Chang et al. (2020) also reported an association between haplogroup T and neuroblastoma.

Figure 8: Estimation of the incidence of mitochondrial haplogroup T in populations of the large regions of Brazil.
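The associations described for haplogroups J, K, L3 and T can be kept in a small lookup table and joined to the sample genotypes. The sketch below is one possible way to do this in base R. The rule in major_haplogroup() for collapsing sub-haplogroups (for example J1 to J, or K1 to K) is an assumption made for illustration, not a rule stated in this work. The sample rows are copied from the haplogroup table, and the associations are those cited in the text above.

```r
# Associations as cited in the text above
reported <- data.frame(
  haplogroup  = c("J", "J", "K", "L3", "T", "T"),
  association = c("cognitive impairment",
                  "type 2 diabetes in Ashkenazi Jews",
                  "protection against neuroblastoma",
                  "cognitive impairment in African Americans",
                  "coronary artery disease and diabetic retinopathy",
                  "association with neuroblastoma"),
  source      = c("Tranah et al. (2012)", "Feder et al. (2008)",
                  "Chang et al. (2020)", "Tranah et al. (2014)",
                  "Kofler et al. (2009)", "Chang et al. (2020)"),
  stringsAsFactors = FALSE
)

# Assumed rule: keep "L" plus one digit for L lineages,
# otherwise drop everything from the first digit or "+" onward
major_haplogroup <- function(genotype) {
  ifelse(grepl("^L[0-9]", genotype),
         substr(genotype, 1, 2),
         sub("[0-9+].*$", "", genotype))
}

# Sample rows copied from the haplogroup table
samples <- data.frame(
  SampleID = c("AF243654", "AF243657", "AF243635", "AF243662", "AF243652"),
  Genotype = c("J", "K1", "L3", "T2", "H1"),
  stringsAsFactors = FALSE
)
samples$haplogroup <- major_haplogroup(samples$Genotype)

# Left join: samples without a listed association keep NA
annotated <- merge(samples, reported, by = "haplogroup", all.x = TRUE)
print(annotated[, c("SampleID", "Genotype", "association", "source")])
```

A haplogroup with more than one reported association, such as T, yields one row per association, so the annotated table should be read as a list of sample and association pairs.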

5 Conclusions

References

X. Chang, M. Bakay, Y. Liu, J. Glessner, K. S. Rathi, J. M. Maris and H. Hakonarson. Mitochondrial DNA haplogroups and susceptibility to neuroblastoma. JNCI J Natl Cancer Inst, 112(12): djaa024, 2020.
D. Chor and C. R. de Araujo Lima. Aspectos epidemiológicos das desigualdades raciais em saúde no Brasil. Cad. Saúde Pública, 21(5): 1586-1594, 2005.
J. Feder, I. Blech, O. Ovadia, S. Amar, J. Wainstein, I. Raz and D. Mishmar. Differences in mtDNA haplogroup distribution among 3 Jewish populations alter susceptibility to T2DM complications. BMC Genomics, 9: 198, 2008. doi:10.1186/1471-2164-9-198.
R. de Freitas Saldanha, R. R. Bastos and C. Barcellos. Microdatasus: A package for downloading and preprocessing microdata from Brazilian Health Informatics Department (DATASUS). 2019. doi:10.1590/0102-311X00032419.
T. van Groningen, J. Koster, L. J. Valentijn, D. A. Zwijnenburg, B. A. Westerman, J. van Nes and R. Versteeg. Neuroblastoma is composed of two super-enhancer-associated differentiation states. 2017. URL https://doi.org/10.1038/ng.3899.
B. Kofler, E. E. Mueller, W. Eder, O. Stanger, R. Maier, M. Weger, F. A. Zimmermann, J. A. Mayr and W. Sperl. Mitochondrial DNA haplogroup T is associated with coronary artery disease and diabetic retinopathy: A case control study. BMC Medical Genetics, 10: 35, 2009. doi:10.1186/1471-2350-10-35.
R. H. M. Pereira and C. N. Goncalves. geobr: Download official spatial data sets of Brazil. 2024. URL https://ipeagit.github.io/geobr/. R package version 1.9.1, https://github.com/ipeaGIT/geobr.
S. Schönherr, H. Weissensteiner, F. Kronenberg and L. Forer. Haplogrep 3 - an interactive haplogroup classification and analysis platform. 2023. URL https://doi.org/10.1093/nar/gkad284.
G. J. Tranah, M. A. Nalls, S. M. Katzman, J. S. Yokoyama, E. T. Lam, Y. Zhao and S. Mooney. Mitochondrial DNA sequence variation associated with dementia and cognitive function in the elderly. J Alzheimers Dis, 32(2): 357-372, 2012. doi:10.3233/JAD-2012-120466.
G. J. Tranah, J. S. Yokoyama, S. M. Katzman, M. A. Nalls, A. B. Newman and T. B. Harris. Mitochondrial DNA sequence associations with dementia and amyloid-β in elderly African Americans. Neurobiology of Aging, 35: 442.e1-442.e8, 2014.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

Chaves & Ramos, "GERALDA: A framework for integration of genomic information into the database of the Brazilian Health System", The R Journal, 2024

BibTeX citation

@article{quokka-bilby,
  author = {Chaves, Gepoliano and Ramos, Pablo Ivan},
  title = {GERALDA: A framework for integration of genomic information into the database of the Brazilian Health System},
  journal = {The R Journal},
  year = {2024},
  issn = {2073-4859},
  pages = {1}
}